Towards a Better Quality Metric for Graph Cluster Evaluation

Authors

  • Hélio Marcos Paz de Almeida
  • Dorgival Olavo Guedes Neto
  • Wagner Meira
  • Mohammed J. Zaki
Abstract

The process of discovering groups of similar vertices in a graph, known as graph clustering, has interesting applications in many different scenarios, such as marketing and recommendation systems. One of the most important aspects of graph clustering is the evaluation of cluster quality, which matters not only for measuring the effectiveness of clustering algorithms, but also for gaining insight into the dynamics of relationships in a given network. Many quality metrics for graph clustering evaluation exist, but the most popular ones have strong biases and structural inconsistencies that make the quality of their results doubtful at best. Our studies showed that, while those popular quality metrics generally do a good job of evaluating the external sparsity between clusters, they do poorly when evaluating the internal density of those clusters: they either ignore essential information (such as a cluster's vertex count) or have their internal density component ignored in practice because of its computational cost. In this article, we propose a new method for evaluating the internal density of a given cluster, one that not only uses more complete information to evaluate that density, but also takes into consideration structural characteristics of the original graph. With our proposed method, the internal density of a cluster is evaluated in terms of the expected density of similar clusters in that same graph, in contrast to the traditional quality metrics available, where clusters from different graphs are compared by the same standards. We believe that, if used in conjunction with a good external sparsity evaluation metric, such as conductance, this method will help to obtain better, more significant graph clustering evaluation results.
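To make the two ingredients named in the abstract concrete, the following is a minimal Python sketch, not the authors' method. It computes conductance (an external sparsity measure) and a plain internal edge density for one cluster of an undirected graph. The function `relative_density` is a hypothetical illustration of "density relative to what the graph would lead us to expect": it uses the whole graph's edge density as the baseline, which is an assumption made here for illustration only, since the abstract does not specify how the expected density is computed. All function names are invented for this example.

```python
def internal_density(edges, cluster):
    """Fraction of possible intra-cluster edges that are actually present."""
    cluster = set(cluster)
    n = len(cluster)
    if n < 2:
        return 0.0
    internal = sum(1 for u, v in edges if u in cluster and v in cluster)
    return internal / (n * (n - 1) / 2)


def conductance(edges, cluster):
    """Cut edges divided by the smaller of the two side volumes (degree sums)."""
    cluster = set(cluster)
    cut = sum(1 for u, v in edges if (u in cluster) != (v in cluster))
    # Each edge endpoint inside the cluster adds one to the cluster's volume.
    vol_in = sum((u in cluster) + (v in cluster) for u, v in edges)
    vol_out = 2 * len(edges) - vol_in
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


def graph_density(edges, num_vertices):
    """Edge density of the whole graph."""
    if num_vertices < 2:
        return 0.0
    return len(edges) / (num_vertices * (num_vertices - 1) / 2)


def relative_density(edges, num_vertices, cluster):
    """ASSUMPTION: the baseline is the global graph density, a stand-in for
    the paper's expected density of similar clusters, which is not given."""
    baseline = graph_density(edges, num_vertices)
    return internal_density(edges, cluster) / baseline if baseline else 0.0


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
cluster = {0, 1, 2}
print(internal_density(edges, cluster))     # 1.0: the triangle is a clique
print(conductance(edges, cluster))          # 1 cut edge / volume 7
print(relative_density(edges, 6, cluster))  # density relative to the graph's own
```

In this toy graph the cluster is a clique, so its internal density is 1.0 regardless of which graph it sits in. Dividing by a graph-specific baseline is one simple way to express the abstract's point that the same cluster should be judged against the graph it came from.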

Similar resources

Avaliação da qualidade de agrupamentos em grafos

The process of discovering groups of similar, connected vertices in a graph, known as graph clustering, has interesting applications in several scenarios, such as biology, marketing and recommendation systems. A major challenge concerning this problem is the evaluation of cluster quality, which is used to measure the effectiveness of clustering algorithms. Many quality metrics for graph cluster ...

Evaluating the Quality of Clustering Algorithms Using Cluster Path Lengths

Many real-world systems can be modeled as networks or graphs. Clustering algorithms that help us to organize and understand these networks are usually referred to as graph-based clustering algorithms. Many algorithms exist in the literature for clustering network data. Evaluating the quality of these clustering algorithms is an important task addressed by different researchers. An important in...

Is There a Best Quality Metric for Graph Clusters?

Graph clustering, the process of discovering groups of similar vertices in a graph, is a very interesting area of study, with applications in many different scenarios. One of the most important aspects of graph clustering is the evaluation of cluster quality, which is important not only to measure the effectiveness of clustering algorithms, but also to give insights on the dynamics of relations...

Fixed points for Chatterjea contractions on a metric space with a graph

In this work, we formulate Chatterjea contractions using graphs in metric spaces endowed with a graph and investigate the existence of fixed points for such mappings under two different hypotheses. We also discuss the uniqueness of the fixed point. The given result is a generalization of Chatterjea's fixed point theorem from metric spaces to metric spaces endowed with a graph.

Graph Hybrid Summarization

One solution for processing and analyzing massive graphs is summarization. Generating a high-quality summary is the main challenge of graph summarization. To generate a better-quality summary for a given attributed graph, both structural and attribute similarities must be considered. There are two measures named density and entropy to evaluate the quality of structural and at...

Journal:
  • JIDM

Volume 3

Publication date: 2012